Abstract

We propose a unified methodology to input non-linear views from any number
of users in fully general non-normal markets, and perform, among others, stress- 
testing, scenario analysis, and ranking allocation. We walk the reader through 
the theory and we detail an extremely efficient algorithm to easily implement 
this methodology under fully general assumptions. As it turns out, no repricing 
is ever necessary, hence the methodology can be readily applied to books with 
complex derivatives. We also present an analytical solution, useful for bench- 
marking, which per se generalizes notable previous results. Code illustrating 
this methodology in practice is available at 

http://www.mathworks.com/matlabcentral/fileexchange/21307.

1 Introduction



Scenario analysis allows the practitioner to explore the implications on a given 
portfolio of a set of subjective views on possible market realizations, see e.g. 
Mina and Xiao (2001). The pathbreaking approach pioneered by Black and 
Litterman (1990) (BL in the sequel) generalizes scenario analysis, by adding 
uncertainty on the views and on the reference risk model. Further generaliza- 
tions have been proposed in recent years. Qian and Gorman (2001) provide a 
framework to stress-test volatilities and correlations in addition to expectations. 
Pezier (2007) processes partial views on expectations and covariances based on 
least discrimination. Meucci (2009) extends the above models to act on risk 
factors instead of returns, and thus covers highly non-linear derivative markets 
and views on external factors that influence the p&l only statistically. 

In the above techniques, the reference distribution of the risk factors is nor- 
mal. The COP in Meucci (2006) explores non-normal markets, but correlation 
stress-testing and non-linear views are not allowed. Furthermore, the COP relies 
on ad-hoc manipulations. 

Here we present the entropy pooling approach (EP in the sequel) which 
fully generalizes the above and related techniques. The inputs are an arbitrary 
market model, which we call "prior", and fully general views or stress-tests 
on that market. The output is a distribution, which we call "posterior", that 
incorporates all the inputs and can be used for risk management and portfolio 
optimization. 

To obtain the posterior, we interpret the views as statements that distort 
the prior distribution, in such a way that the least possible amount of spurious 
structure is imposed. The natural index for the structure of a distribution is its 
entropy. Therefore we define the posterior distribution as the one that minimizes 
the entropy relative to the prior. Then by opinion pooling we assign different 
confidence levels to different views and users. 

Among others, the EP handles non-normal markets; views on non-linear 
combinations of risk factors that impact the p&l directly or only statistically 
through correlations; views on expectations, but also medians, to handle fat 
tails; views on volatilities, correlations, tail behaviors, etc.; lax views, such as 
ranking, on all of the above, thereby generalizing Almgren and Chriss (2006); 
inputs from multiple users and multiple confidence levels for different views. 

Furthermore, in its most general implementation the reference model is rep- 
resented by Monte Carlo simulations, and the posterior which incorporates all 
the inputs is represented by the same simulations with new probabilities. Hence 
the most complex securities can be handled without costly repricing. 

In Section 2 we introduce the EP theoretical framework. In Section 3 we
present an analytical formula, which generalizes the previous results and
provides a benchmark for the numerical implementation. In Section 4 we discuss
the numerical routine to implement the EP in full generality. In Section 5 we
illustrate a case study: option trading in a non-normal environment with
non-linear and ranking views on realized volatility, implied volatility and
external macro factors. In Section 6 we conclude, comparing the EP to other
related techniques. Fully documented code for this and other case studies, such
as portfolios from ranking, can be downloaded at MATLAB Central File Exchange.

2 The entropy pooling approach 

We consider a book driven by an N-dimensional vector of risk factors X. In other
words, denoting by t the current time, by $I_t$ the information currently available,
and by $\tau$ the time to the investment horizon, there exists a deterministic function
P that maps the realizations of X and the information $I_t$ into the price $P_{t+\tau}$ of
each security in the book at the horizon:

$$P_{t+\tau} = P(X, I_t). \tag{1}$$

This framework is completely general. For instance, in a book of options X
can represent the changes in all the underlyings and implied volatilities: in this
case (1) is approximated by a second-order Taylor expansion whose coefficients
are the "deltas", "vegas", "gammas", "vannas", "volgas", etc. Also, X can
represent a set of risk factors behind a computationally expensive full Monte
Carlo pricing function, such as interest rate values at different monitoring times
for mortgage derivatives. Furthermore, X can be augmented with a set of
external risk factors that do not feed directly into the pricing function (1), but
that still influence the p&l statistically through correlation. We explore a
detailed example in these directions in Section 5. In any case, we emphasize that
X can be, but by no means is restricted to, returns on a set of securities.

The reference model 
We assume the existence of a risk model, i.e. a model for the joint distribution
of the risk factors, as represented by its probability density function (pdf)

$$X \sim f_X. \tag{2}$$

In BL, this is the "prior" factor distribution. More generally, this is a model
that risk managers use to perform risk analyses, such as the computation of
the volatility, tracking error, VaR, expected shortfall of a portfolio, along with
the contributions to such measures from the different sources of risk. Portfolio
managers and traders on the other hand use this model to optimize their
positions. They specify a subjective index of satisfaction S, such as the
mean-(C)VaR trade-off, or the certainty equivalent stemming from a utility
function, or a spectral measure, etc., see examples in Meucci (2005). Satisfaction
depends both on the market distribution $f_X$ through the prices (1) and on the
positions in the book, represented by a vector w. Then the optimal book $w^*$ is
defined as

$$w^* \equiv \operatorname*{argmax}_{w \in C} \{ S(w; f_X) \}, \tag{3}$$

where C is a given set of investment constraints. The reference model (2) can be
estimated from historical analysis, or calibrated to current market observables,
see Meucci (2009).

The views

In the most general case, the user expresses views on generic functions of the
market $g_1(X), \ldots, g_K(X)$. These functions constitute a K-dimensional random
variable whose joint distribution is implied by the reference model

$$V \equiv g(X) \sim f_V. \tag{4}$$

We emphasize that, unlike in BL, in EP we do not assume that the functions
$g_k$ be linear. Notice that, as a special case, one can express views also on the
securities values (1).

The views, or the stress-tests, are statements on the variables (4) which
can clash with the reference model. In a stochastic environment, this means
statements on their distribution. Therefore, the most detailed possible view
specification is a complete, subjective joint distribution for those variables:

$$V \sim \tilde f_V \neq f_V. \tag{5}$$

However, views in general are statements on only select features of the
distribution of V.

• The classical views a-la BL are statements on $\tilde E\{V_k\}$, the expectations of
each of the $V_k$'s according to the new distribution $\tilde f_V$. Since for
distributions such as stable distributions the expectation is not defined, in EP we
consider views on a more general location measure $\tilde m\{V_k\}$, which can be
the expectation or the median. The views are then set as

$$\tilde m\{V_k\} \gtreqless \tilde m_k, \quad k = 1, \ldots, K. \tag{6}$$

The values $\tilde m_k$ can be determined exogenously. If the user has only
qualitative views, it is convenient to set as in Meucci (2010)

$$\tilde m_k \equiv m\{V_k\} + \kappa \sigma\{V_k\}. \tag{7}$$

In this expression $\sigma$ is a measure of volatility in the reference model, such
as the standard deviation or, in fat-tailed markets with infinite variance,
the interquartile range; and $\kappa$ is an ad-hoc multiplier, such as -2, -1, 1,
and 2 for "very bearish", "bearish", "bullish" and "very bullish"
respectively.

• The generalized BL views (6) are not necessarily expressed as equality
constraints: EP can process views expressed as inequalities. In particular,
EP can process ordering information, frequent in stock and bond management:

$$\tilde m\{V_1\} \ge \tilde m\{V_2\} \ge \cdots \ge \tilde m\{V_K\}. \tag{8}$$

• Views can be expressed on the volatilities. A convenient formulation reads:

$$\tilde\sigma\{V_k\} \gtreqless \kappa \sigma\{V_k\}, \quad k = 1, \ldots, K. \tag{9}$$

• Correlation stress-tests are also views. A convenient specification for the
correlation matrix $\tilde C\{V\}$ is the homogeneous shrinkage

$$\tilde C\{V\} \equiv \rho_1 I + \rho_2 C\{V\} + \rho_3 1 1', \tag{10}$$

where $0 < \rho_1, \rho_2, \rho_3 < 1$, $\rho_1 + \rho_2 + \rho_3 \equiv 1$, I is the identity matrix and 1
is a vector of ones. For different structures see e.g. Brigo and Mercurio
(2001).

• The user can input views on the lower (upper) tail behavior, as represented
e.g. by $\tilde Q_{V_k}(u)$, the quantile of $V_k$ according to the new distribution $\tilde f_V$,
where the tail level u is close to zero (one). A convenient specification is

$$\tilde Q_{V_k}(u) \gtreqless Q_{V_k}(u), \tag{11}$$

where $Q_{V_k}$ is the reference quantile induced by $f_V$, or alternatively
benchmark quantiles such as the normal or the Student t.

• Lower (upper) tail codependence, as represented by $\tilde C_V(u)$, the cdf of the
copula of V at joint threshold levels u close to zero (one). A convenient
specification reads

$$\tilde C_V(u) \gtreqless \kappa C_V(u), \tag{12}$$

where $C_V$ is the reference copula cdf induced by $f_V$, or alternatively
benchmark copula cdf's such as normal or Student t.

The above is a very partial list of all the possible features on which the user 
can wish to express views, and which can be handled by the EP. 
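
As an illustration of the correlation stress-test (10), the following is a minimal sketch in plain Python. The function name `shrink_correlation` and the numerical inputs are hypothetical and not part of the original code; the sketch only applies the shrinkage weights entry by entry.

```python
def shrink_correlation(C, rho1, rho2, rho3):
    """Homogeneous shrinkage as in (10): rho1*I + rho2*C + rho3*11'."""
    if min(rho1, rho2, rho3) < 0 or abs(rho1 + rho2 + rho3 - 1.0) > 1e-12:
        raise ValueError("weights must be non-negative and sum to one")
    n = len(C)
    # (i == j) is 1 on the diagonal and 0 elsewhere, i.e. the identity matrix
    return [[rho1 * (i == j) + rho2 * C[i][j] + rho3 for j in range(n)]
            for i in range(n)]

# Hypothetical 2x2 reference correlation matrix and shrinkage weights
C_ref = [[1.0, 0.2], [0.2, 1.0]]
C_stress = shrink_correlation(C_ref, 0.1, 0.5, 0.4)
```

Because the weights sum to one, the diagonal of the stressed matrix stays equal to one, while every off-diagonal entry is pulled towards the common level set by the third weight.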

The posterior

The posterior distribution should satisfy the views without adding additional
structure and should be as close as possible to the reference model (2).

The relative entropy between a generic distribution $\tilde f_X$ and a reference
distribution $f_X$

$$\mathcal{E}(\tilde f_X, f_X) \equiv \int \tilde f_X(x) [\ln \tilde f_X(x) - \ln f_X(x)] \, dx \tag{13}$$

is a natural measure of the amount of structure in $\tilde f_X$; furthermore, it also
measures how distorted $\tilde f_X$ is with respect to $f_X$. Indeed, if the two
distributions coincide, relative entropy is zero; by imposing constraints on
$\tilde f_X$ this distribution departs from $f_X$ and relative entropy increases.

Therefore, we define the posterior market distribution as

$$\tilde f_X \equiv \operatorname*{argmin}_{f \in \mathbb{V}} \mathcal{E}(f, f_X), \tag{14}$$

where $f \in \mathbb{V}$ stands for all the distributions consistent with the views
statements such as (6)-(12).

Entropy minimization is widely applied in physics and statistics, see Cover
and Thomas (2006). For applications to finance, see e.g. Avellaneda (1999),
D'Amico, Fusai, and Taghani (2003), Cont (2007) and Pezier (2007). In our
context, entropy minimization is even more natural, as it generalizes Bayesian
updating, see Caticha and Giffin (2006).

The confidence 

One last step is required: the posterior $\tilde f_X$ follows by assuming that the
practitioner has full confidence in his statements. If the confidence is less than
full, the posterior distribution of the factors must shrink towards the reference
factor distribution. This is easily achieved as in Meucci (2006) by opinion-pooling
the reference model and the full-confidence posterior:

$$\tilde f_X^c \equiv (1 - c) f_X + c \tilde f_X. \tag{15}$$

The pooling parameter $c \in [0, 1]$ represents the confidence level in the views:
in the extreme case when the confidence is total, the full-confidence posterior
is recovered; on the other hand, in the absence of confidence, the reference risk
model is recovered.

Opinion pooling becomes very useful in a multi-manager context. Indeed,
consider S users that input their separate views on (possibly, but not necessarily)
different functions of the market. As in (14), we obtain S full-confidence
posterior distributions $\tilde f_X^{(s)}$, $s = 1, \ldots, S$. Then the posterior distribution
results naturally as the confidence-weighted average of the individual
full-confidence posteriors:

$$\tilde f_X^c \equiv \sum_{s=1}^S c_s \tilde f_X^{(s)}. \tag{16}$$

These confidence levels can be linked naturally to the track-record of the
respective manager, i.e. the s-th confidence $c_s$ can be set as an increasing function
of the number of past views, i.e. seniority, and of the correlation of these views
with the actual market realization, in the same spirit as the "skill" measure in
Grinold and Kahn (1999).

The definitions (15)-(16) follow from a probabilistic interpretation of the
confidence: one can easily specify different confidence levels for the different
views of the same user and integrate these within a multi-user context. As it
turns out, this amounts to specifying a probability measure on the power set of
the views: we discuss these simple rules in detail in Appendix A.4.

We emphasize that, unlike in BL, in EP the confidence in the views and
the views on volatility (9) are modeled separately: indeed, being sure about
future volatility and being uncertain about future market realizations are two
very different issues.

Limit cases 

If the practitioner has no views, i.e. $\mathbb{V}$ is the empty set in (14), then the
confidence-weighted posterior distribution equals the reference model $f_X$.

On the other extreme, if the views fully specify a joint distribution (5) the
minimization (14) is not necessary. Indeed, consistently with the principle of
minimum discrimination information, the full-confidence posterior follows from
its conditional-marginal decomposition:

$$\tilde f_X(x) \equiv \int f_{X|v}(x) \tilde f_V(v) \, dv. \tag{17}$$

In particular, this is the case in scenario analysis, where the user associates full
probability to one single scenario $g(X) \equiv \tilde v$: the views are represented with a
Dirac delta centered on the scenario $\tilde f_V(v) \equiv \delta(v - \tilde v)$, which, substituted in
(17), yields $\tilde f_X \equiv f_{X|\tilde v}$. In words, the full-confidence posterior distribution is
simply the reference distribution, conditioned on g(X) assuming the scenario
values $\tilde v$. Therefore, EP includes full-distribution specification and standard
scenario analysis as special cases.



3 An analytical formula 

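Consider as in BL a normal reference model

$$X \sim N(\mu, \Sigma). \tag{18}$$

Consider views on the expectations of arbitrary linear combinations QX and
on the covariances of arbitrary, potentially different, linear combinations GX

$$\mathbb{V}: \begin{cases} \tilde E\{QX\} \equiv \tilde\mu_Q \\ \widetilde{Cov}\{GX\} \equiv \tilde\Sigma_G, \end{cases} \tag{19}$$

where Q, G, $\tilde\Sigma_G$ and $\tilde\mu_Q$ are conformable matrices/vectors.

As we show in Appendix A.1, the full-confidence posterior distribution (14)
is normal:

$$X \sim N(\tilde\mu, \tilde\Sigma), \tag{20}$$

where

$$\tilde\mu \equiv \mu + \Sigma Q' (Q \Sigma Q')^{-1} (\tilde\mu_Q - Q\mu), \tag{21}$$

$$\tilde\Sigma \equiv \Sigma + \Sigma G' \left( (G \Sigma G')^{-1} \tilde\Sigma_G (G \Sigma G')^{-1} - (G \Sigma G')^{-1} \right) G \Sigma. \tag{22}$$

Then the confidence-weighted posterior distribution (15) is a normal mixture:

$$X \sim \begin{cases} N(\mu, \Sigma) & \text{(probability: } 1 - c) \\ N(\tilde\mu, \tilde\Sigma) & \text{(probability: } c). \end{cases} \tag{23}$$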

This distribution is suitable for instance to stress-test market crashes, where
high volatilities, high correlations and low expectations in $\tilde\mu$, $\tilde\Sigma$ are expected to
occur with probability $c \ll 1$.

Formula (23) generalizes results in Pezier (2007). Also, the special case of
full confidence c = 1 on only one set of linear combinations Q = G yields the
result in Qian and Gorman (2001): this is not surprising, as the authors' approach
is equivalent to the decomposition (17). Finally, the further specialization to
null dispersion in the views, $\tilde\Sigma_Q \equiv 0$, yields scenario analysis as in Meucci
(2005), which in turn generalizes the standard regression-based approach that
appears e.g. in Mina and Xiao (2001).
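
To make formulas (21)-(22) concrete, here is a minimal sketch for the special case of a single linear combination, Q = G = q', so that the matrix inverses reduce to a division by the scalar q'Σq. The function name `ep_normal_single_view` and all numerical inputs are hypothetical, and the helper routines use plain Python lists rather than a linear algebra library.

```python
def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ep_normal_single_view(mu, Sigma, q, m_view, s2_view):
    """Posterior mean and covariance from (21)-(22) when Q = G = q' (one row).

    m_view is the view on the expectation of q'X, s2_view on its variance.
    """
    Sq = matvec(Sigma, q)                 # Sigma q (Sigma is symmetric)
    qSq = dot(q, Sq)                      # scalar q' Sigma q
    shift = (m_view - dot(q, mu)) / qSq
    mu_post = [m + s * shift for m, s in zip(mu, Sq)]
    scale = s2_view / qSq ** 2 - 1.0 / qSq
    n = len(mu)
    Sigma_post = [[Sigma[i][j] + Sq[i] * Sq[j] * scale for j in range(n)]
                  for i in range(n)]
    return mu_post, Sigma_post

# Hypothetical two-factor market with a view on the spread X_1 - X_2
mu = [0.0, 0.0]
Sigma = [[1.0, 0.5], [0.5, 1.0]]
q = [1.0, -1.0]
mu_post, Sigma_post = ep_normal_single_view(mu, Sigma, q, m_view=0.5, s2_view=2.0)
```

By construction the posterior reproduces the views exactly: the posterior expectation of q'X equals `m_view` and its posterior variance equals `s2_view`, which is a convenient sanity check when benchmarking a numerical implementation.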

4 Numerical implementation 

Except for the special case in Section 3, the EP cannot be implemented
analytically. However, the numerical implementation of the EP in full generality is
extremely simple and computationally efficient.

First, we represent the reference distribution (2) of the market X in terms
of a J × N panel $\mathcal{X}$ of simulations: the generic j-th row of $\mathcal{X}$ represents one in a
very large number of joint scenarios for the N variables X, whereas the generic
n-th column of $\mathcal{X}$ represents the marginal distribution of the n-th factor $X_n$.
With the scenarios we associate the J × 1 vector of the respective probabilities
p, each entry of which typically, but not necessarily, equals 1/J, see Glasserman
and Yu (2005) for a variety of methods to determine p.

We assume that each of the joint scenarios in $\mathcal{X}$ has been mapped into the
respective joint price scenarios for the I securities in the market considered by
the user, by means of the potentially costly function (1), thereby generating a
J × I panel of prices $\mathcal{P}$. The panel of the security prices $\mathcal{P}$, along with the
respective probabilities p, is then analyzed for risk management purposes, or it
is fed into an optimization algorithm to perform the asset allocation step (3).

The user expresses views on generic non-linear functions of the market (4).
Their distribution as implied by the reference model is readily represented by
the J × K panel $\mathcal{V}$ defined entry-wise as follows:

$$\mathcal{V}_{j,k} \equiv g_k(\mathcal{X}_{j,1}, \ldots, \mathcal{X}_{j,N}). \tag{24}$$

To represent the posterior distribution of the market that includes the views,
instead of generating new simulations, we use the same scenarios with different
probabilities $\tilde p$. Then, as we show in Appendix A.2, general views such as
(6)-(12) can be written as a set of linear constraints on the new, yet to be
determined, probabilities

$$\underline a \le A \tilde p \le \bar a, \tag{25}$$

where A, $\underline a$ and $\bar a$ are simple expressions of the panel (24). For instance, for
standard views on expectations $A \equiv \mathcal{V}'$ and $\underline a \equiv \bar a$ quantify the views.
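
The construction of the views panel (24) and of the constraint matrix for views on expectations can be sketched as follows. This is only an illustration in plain Python: the names `views_panel` and `expectation_constraint_rows`, the toy scenarios and the two view functions are all hypothetical.

```python
def views_panel(X, g_funcs):
    """Panel (24): V[j][k] = g_k applied to the j-th joint scenario X[j]."""
    return [[g(row) for g in g_funcs] for row in X]

def expectation_constraint_rows(V):
    """For views on expectations A = V': one row of A per view."""
    return [list(col) for col in zip(*V)]

# Hypothetical J = 3 scenarios of N = 2 risk factors
X = [[0.01, -0.02], [-0.03, 0.01], [0.02, 0.04]]
# Hypothetical views: a non-linear function (absolute value) and a spread
g_funcs = [lambda x: abs(x[0]), lambda x: x[1] - x[0]]

V = views_panel(X, g_funcs)
A = expectation_constraint_rows(V)
p = [1.0 / len(X)] * len(X)              # equal prior probabilities
prior_means = [sum(a * pj for a, pj in zip(row, p)) for row in A]
```

Applying A to the prior probabilities returns the reference expectations of the view variables; a view is then a bound on the same linear expression evaluated at the new probabilities.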

Furthermore, the relative entropy (13) becomes its discrete counterpart

$$\mathcal{E}(\tilde p, p) \equiv \sum_{j=1}^J \tilde p_j [\ln(\tilde p_j) - \ln(p_j)]. \tag{26}$$

Figure 1: Entropy pooling: numerical approach matches analytical solution

Therefore, the full-confidence posterior distribution (14) is defined as

$$\tilde p \equiv \operatorname*{argmin}_{\underline a \le A f \le \bar a} \mathcal{E}(f, p). \tag{27}$$

This optimization can be solved very efficiently: as we show in Appendix A.3, the
dual formulation is a simple linearly constrained convex program in a number
of variables equal to the number of views, not the number of Monte Carlo
simulations, which can be kept large. Therefore we can achieve an excellent
accuracy even under extreme views, see Figure 1.
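
The simplest instance of (27) is a single binding view on one expectation. In that case the minimum relative entropy solution is an exponential tilting of the prior probabilities, and the only unknown is one scalar multiplier, regardless of the number of scenarios. The sketch below solves for that multiplier by bisection; it is not the routine of Appendix A.3, it assumes strictly positive prior probabilities, and the names `tilt` and `entropy_pooling_mean_view` are hypothetical.

```python
import math

def tilt(p, v, theta):
    """Probabilities proportional to p_j * exp(theta * v_j)."""
    z = [theta * vj for vj in v]
    zmax = max(z)                          # shift exponents for stability
    w = [pj * math.exp(zj - zmax) for pj, zj in zip(p, z)]
    total = sum(w)
    return [wj / total for wj in w]

def tilted_mean(p, v, theta):
    return sum(q * vj for q, vj in zip(tilt(p, v, theta), v))

def entropy_pooling_mean_view(p, v, target, iters=200):
    """Posterior probabilities for the single view sum_j q_j v_j = target."""
    if not min(v) < target < max(v):
        raise ValueError("target must lie strictly inside the scenario range")
    lo, hi = -1.0, 1.0
    while tilted_mean(p, v, lo) > target:  # widen the bracket downwards
        lo *= 2.0
    while tilted_mean(p, v, hi) < target:  # widen the bracket upwards
        hi *= 2.0
    for _ in range(iters):                 # the tilted mean increases in theta
        mid = 0.5 * (lo + hi)
        if tilted_mean(p, v, mid) < target:
            lo = mid
        else:
            hi = mid
    return tilt(p, v, 0.5 * (lo + hi))

def relative_entropy(q, p):
    """Discrete relative entropy (26); terms with q_j = 0 contribute zero."""
    return sum(qj * (math.log(qj) - math.log(pj))
               for qj, pj in zip(q, p) if qj > 0)

# Hypothetical view variable with four equally likely scenarios
v = [-1.0, 0.0, 1.0, 2.0]
p = [0.25] * 4
q = entropy_pooling_mean_view(p, v, target=1.0)
```

With several views, or with inequality views, the same idea applies with one multiplier per view and a generic convex solver in place of the bisection.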

Now it is immediate to compute the opinion-pooling, confidence-weighted
posterior (15): this is represented by $(\mathcal{X}, \tilde p^c)$, the same simulations as for the
reference model, but with new probabilities

$$\tilde p^c \equiv (1 - c) p + c \tilde p. \tag{28}$$

A similar expression holds for the more general multi-user, multi-confidence
posterior discussed in Appendix A.4.

Since the posterior factor distribution is obtained by tweaking the relative
probabilities of the scenarios $\mathcal{X}$ without affecting the scenarios themselves, the
posterior distribution of the market prices is represented by $(\mathcal{P}, \tilde p^c)$, the original
panel of joint prices and the new probabilities. Hence no repricing is necessary
to process views and stress-tests.
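
The pooling step and the absence of repricing can be illustrated with a short sketch. The function names `pool` and `expected_value` and all numbers are hypothetical; the only point is that the same price scenarios are reused with reweighted probabilities.

```python
def pool(prior, posteriors, confidences):
    """Confidence-weighted probabilities; the residual weight goes to the prior."""
    c0 = 1.0 - sum(confidences)
    if c0 < -1e-12 or any(c < 0 for c in confidences):
        raise ValueError("confidences must be non-negative and sum to at most one")
    pooled = [c0 * pj for pj in prior]
    for c, post in zip(confidences, posteriors):
        pooled = [a + c * b for a, b in zip(pooled, post)]
    return pooled

def expected_value(prices, probs):
    """Expectation of a column of the price panel under given probabilities."""
    return sum(x * q for x, q in zip(prices, probs))

# Hypothetical price scenarios of one security, computed once under the prior
prices = [90.0, 100.0, 105.0, 120.0]
prior = [0.25, 0.25, 0.25, 0.25]
full_confidence = [0.1, 0.2, 0.3, 0.4]    # hypothetical full-confidence posterior

p_c = pool(prior, [full_confidence], [0.5])
prior_mean = expected_value(prices, prior)
posterior_mean = expected_value(prices, p_c)
```

With one user this reproduces (28); passing several full-confidence posteriors and their confidences gives the multi-user weighted average, again without touching the price scenarios.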
5 Case study: option trading



As in Meucci (2009), we consider a trader of butterflies, defined as long positions
in one call and one put with the same strike, underlying, and time to maturity.
The price $P_{t+\tau}$ of the butterfly at the investment horizon can be written in the
format (1) as a deterministic non-linear function of a set of risk factors and
current information. Indeed

$$P_{t+\tau} = BS(y_t e^{X_y}, h(y_t e^{X_y}, \sigma_t + X_\sigma, K, T - \tau); K, T - \tau, r). \tag{29}$$

In this expression $\tau$ is the investment horizon; $y_t$ is the current value and
$X_y \equiv \ln(y_{t+\tau} / y_t)$ is the log-change of the underlying; $\sigma_t$ is the current value and
$X_\sigma \equiv \sigma_{t+\tau} - \sigma_t$ is the change in ATM implied volatility; BS is the
Black-Scholes formula

$$BS(y, \sigma; K, T, r) \equiv y [\Phi(d_1) - \Phi(-d_1)] - K e^{-rT} [\Phi(d_2) - \Phi(-d_2)], \tag{30}$$

where $\Phi$ is the standard normal cdf; K is the strike; T is the time to expiry;
r is the risk-free rate; $d_1 \equiv (\ln(y/K) + (r + \sigma^2/2) T) / (\sigma \sqrt{T})$, $d_2 \equiv d_1 - \sigma \sqrt{T}$;

and h is a skew/smile map

$$h(y, \sigma; K, T) \equiv \sigma + \alpha \frac{\ln(y/K)}{\sqrt{T}} + \beta \left( \frac{\ln(y/K)}{\sqrt{T}} \right)^2, \tag{31}$$

for coefficients $\alpha$ and $\beta$ which depend on the underlying and are fitted
empirically, similarly to Malz (1997). If the investment horizon $\tau$ is short, a
delta-gamma-vega approximation of (29) would suffice. However, we leave the exact
formulation to demonstrate how the present approach does not require costly
repricing.
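
The pricing formula (30) is straightforward to code. The sketch below uses only the Python standard library; the function names and the numerical inputs are hypothetical, and the smile map (31) is left out, so the volatility is passed in directly.

```python
import math

def norm_cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1_d2(y, sigma, K, T, r):
    d1 = (math.log(y / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    return d1, d1 - sigma * math.sqrt(T)

def bs_call_plus_put(y, sigma, K, T, r):
    """Formula (30): value of one call plus one put, same strike and expiry."""
    d1, d2 = d1_d2(y, sigma, K, T, r)
    return (y * (norm_cdf(d1) - norm_cdf(-d1))
            - K * math.exp(-r * T) * (norm_cdf(d2) - norm_cdf(-d2)))

# Hypothetical inputs: spot 100, 20% volatility, strike 100, six months, 1% rate
price = bs_call_plus_put(100.0, 0.2, 100.0, 0.5, 0.01)
```

In the case study this function would be evaluated once per scenario of the prior panel; views and stress-tests afterwards only change the probabilities attached to those values.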

Consider a portfolio represented by the vector w, whose generic i-th entry
is the number of contracts in the respective butterfly. The p&l then reads

$$\Pi_w \equiv \sum_{i=1}^I w_i (P_i(X, I_t) - P_{i,t}), \tag{32}$$

where $P_i(X, I_t)$ is the price at the horizon (29) and $P_{i,t}$ is the currently traded
price of the i-th butterfly. We assume that, in order to account for market
asymmetries and downside risk, the trader optimizes the mean-CVaR trade-off.
Therefore (3) becomes

$$w_\lambda \equiv \operatorname*{argmax}_{\underline b \le B w \le \bar b} \{ E\{\Pi_w\} - \lambda \, CVaR_\gamma\{\Pi_w\} \}, \tag{33}$$

where $\gamma$ is the CVaR tail level; and B, $\underline b$, and $\bar b$ are a matrix and vectors that
represent investment constraints.

To illustrate, we set $\gamma \equiv 95\%$, we impose that the long-short positions offset
to a zero delta and a zero initial budget, and that the absolute investment in
each option does not exceed a fixed threshold. We set the investment horizon as
$\tau \equiv 1$ day. We consider a limited market of I = 9 securities: 1-month, 2-month
and 6-month butterflies on the three technology stocks Microsoft (M), Yahoo
(Y) and Google (G).

In addition to the respective underlyings and implied volatilities, we include
the possibility of views on growth or inflation, as represented by the slope of
the interest rate curve: therefore we add the changes in the two- and ten-year
points of the curve, for a total of N = 14 factors:

$$X \equiv (X_y^M, X_{1m}^M, X_{2m}^M, X_{6m}^M, \ldots, X_y^G, X_{1m}^G, X_{2m}^G, X_{6m}^G, X_{2y}, X_{10y}). \tag{34}$$

To determine the reference distribution (2) of these factors we consider the panel
of joint observations of the factors over a three-year horizon: this amounts to
700 observations. To achieve J = 10^ joint simulations we kernel-bootstrap
the historical scenarios: for each historical observation $x_t$, we draw 10^/700
observations from the multivariate normal distribution $N(x_t, \epsilon \hat\Sigma)$, where $\hat\Sigma$ is
the sample covariance and we set $\epsilon \equiv 0.15$. The juxtaposition of the above
simulations yields the desired J × N panel $\mathcal{X}$, where each scenario has equal
probability $p_j \equiv 1/J$.
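
The kernel bootstrap described above can be sketched in a simplified univariate form, where the covariance matrix reduces to a sample variance. This is an assumption made to keep the example free of linear algebra; the multivariate case would draw from a joint normal around each observation. The function name `kernel_bootstrap_1d`, the toy history and the number of draws are hypothetical.

```python
import math
import random
import statistics

def kernel_bootstrap_1d(observations, draws_per_obs, eps=0.15, seed=0):
    """Draw scenarios from N(x_t, eps * sample variance) around each observation."""
    rng = random.Random(seed)              # seeded for reproducibility
    sd = math.sqrt(eps * statistics.variance(observations))
    scenarios = [rng.gauss(x, sd)
                 for x in observations
                 for _ in range(draws_per_obs)]
    probs = [1.0 / len(scenarios)] * len(scenarios)   # equal prior probabilities
    return scenarios, probs

# Hypothetical history of daily changes in one risk factor
history = [0.010, -0.020, 0.005, 0.015, -0.010]
scenarios, probs = kernel_bootstrap_1d(history, draws_per_obs=200)
```

The smoothing parameter plays the same role as in the text: a small value keeps the simulated scenarios close to the historical ones while filling the gaps between them.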

Then we input each scenario of $\mathcal{X}$ into the pricing function (30), obtaining
the joint p&l scenarios $\mathcal{P}$ with equal probabilities p. The sample counterpart
of the mean-CVaR efficient frontier (33) reads

$$w_\lambda \equiv \operatorname*{argmax}_{\underline b \le B w \le \bar b} \left\{ w' \mathcal{P}' p + \lambda \frac{w' [\mathcal{P}]' [p]}{[p]' [1]} \right\}, \tag{35}$$

where the operator [x] selects in the generic vector x only the entries that
correspond to the $(1 - \gamma) J$ smallest entries of $\mathcal{P} w$. If J is not too large this can
be solved by linear programming as in Rockafellar and Uryasev (2000). For very
large J we solve this heuristically as in Meucci (2005) by a two-step approach:
first determine the mean-variance efficient frontier, then perform a uni-variate
grid search for the optimal trade-off (35).
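
For a fixed portfolio, the sample objective inside the optimization can be evaluated as in the following sketch. It follows the tail selection described in the text, i.e. the scenarios with the smallest p&l, and works with arbitrary scenario probabilities. The function name `mean_cvar_objective` and the toy p&l scenarios are hypothetical; the optimization over the positions is not shown.

```python
def mean_cvar_objective(pnl, probs, lam, gamma=0.95):
    """Sample mean minus lam times CVaR of the portfolio p&l scenarios.

    The tail consists of the (1 - gamma) * J scenarios with the smallest p&l.
    """
    J = len(pnl)
    n_tail = max(1, int(round((1.0 - gamma) * J)))
    tail = sorted(range(J), key=lambda j: pnl[j])[:n_tail]
    mean = sum(x * q for x, q in zip(pnl, probs))
    tail_mass = sum(probs[j] for j in tail)
    tail_mean = sum(probs[j] * pnl[j] for j in tail) / tail_mass
    cvar = -tail_mean                      # CVaR reported as a positive loss
    return mean - lam * cvar, mean, cvar

# Hypothetical portfolio p&l in 20 equally likely scenarios
pnl = [float(x) for x in range(-10, 10)]
probs = [0.05] * 20
objective, mean, cvar = mean_cvar_objective(pnl, probs, lam=1.0, gamma=0.9)
```

Re-evaluating the same function with posterior probabilities in place of `probs` is all that is needed to see how a view moves the trade-off for a given portfolio.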

In Figure 2 we display the frontier ensuing from the reference market model
in our example. For the extreme case of zero risk appetite, not investing at all
is optimal. As the risk appetite increases, leverage increases, always respecting
the constraint of a zero net initial investment, as well as delta-neutrality. When
the risk appetite increases further, the remaining constraints enter the picture.

Figure 2: Mean-CVaR long-short efficient frontier: prior risk model

Now we consider the views of three distinct analysts. The first one is bearish
about the 2m-6m implied volatility spread for Google. From (6)-(7) this means

$$\tilde E\{X_{6m}^G - X_{2m}^G\} \le E\{X_{6m}^G - X_{2m}^G\} - \sigma\{X_{6m}^G - X_{2m}^G\}. \tag{36}$$

This view is represented in the form (25) as

$$\sum_{j=1}^J \tilde p_j^{(1)} (\mathcal{X}_{j,6m}^G - \mathcal{X}_{j,2m}^G) \le \hat m_{6-2} - \hat\sigma_{6-2}, \tag{37}$$

where $\hat m_{6-2}$ and $\hat\sigma_{6-2}$ are the sample counterparts of the respective terms in
(36). We can compute $\tilde p^{(1)}$ as in (27), under the constraint (37). To illustrate,
we show in Figure 3 the mean-CVaR efficient frontier (35) when this view is
processed: as expected, the G6m-G2m spread, previously long, is now short.




Figure 3: Mean-CVaR long-short efficient frontier: view on G6m-G2m spread (horizontal axis: CVaR)

The second analyst is bullish on the realized volatility of Microsoft, defined
as $|X_y^M|$, the absolute log-change in the underlying: this is the variable such
that, if larger than a threshold, a long position in the butterfly turns into a
profit. Since this variable displays thick tails and the expectation might not be
defined, see e.g. Rachev (2003), we issue a relative statement on the median,
comparing it with the third quintile implied by the reference market model:

$$\tilde m\{|X_y^M|\} \ge Q_{|X_y^M|}(3/5). \tag{38}$$

This view is represented in the form (25) as

$$\sum_{j \in \mathcal{J}} \tilde p_j^{(2)} \le \frac{1}{2}, \tag{39}$$

where $\mathcal{J}$ is the set of indices j such that $|\mathcal{X}_{j,y}^M|$ is smaller than the sample third
quintile of $|X_y^M|$, see Appendix A.2. Now we can compute $\tilde p^{(2)}$ as in (27) under
the constraint (39).

The third analyst believes that the slope of the curve will increase by five
basis points. Therefore he formulates the view a-la BL, using in (6) expectations
and binding constraints:

$$\sum_{j=1}^J \tilde p_j^{(3)} (\mathcal{X}_{j,10y} - \mathcal{X}_{j,2y}) \equiv 0.0005, \tag{40}$$

and $\tilde p^{(3)}$ can be computed as in (27).




Figure 4: Mean-CVaR long-short efficient frontier: all views

The management committee attributes $c_1 \equiv 0.20$, $c_2 \equiv 0.25$ and $c_3 \equiv 0.20$
confidence to the analysts' views, the remaining portion being attributed to the
reference model. Then the uncertainty-weighted posterior probabilities read

$$\tilde p^c \equiv \sum_{s=0}^3 c_s \tilde p^{(s)}, \tag{41}$$

where $c_0 \equiv 1 - c_1 - c_2 - c_3$ and $\tilde p^{(0)} \equiv p$. We show in Figure 4 the combined
effects of all the views on the frontier (35).

We emphasize that in this case study the market has a non-parametric, 
thick-tailed, non-normal distribution; two views are expressed as inequalities; 
one view acts on a non-linear function, the absolute value, of a factor; the slope 
of the curve in one view is an external factor that appears nowhere in the pricing 
function of the securities; features different from expectations are being assessed, 
namely the median; and no repricing was ever necessary. 
6 Conclusions



We present the EP, a unified framework to perform trading, portfolio
management and generalized stress-testing in markets with complex derivatives driven
by non-normal factors. The inputs are a possibly non-normal reference market
model and a set of very general equality or inequality views on a variety of
features of the market. The output is a posterior distribution that incorporates
all the inputs. As it turns out, the EP avoids costly repricing by representing
the posterior distribution in terms of the same scenarios as the reference model,
but with different probabilities whose computation is extremely efficient.

We summarize in the table below the capabilities of the EP as compared 
to Black and Litterman (1990), Almgren and Chriss (2006), Qian and Gorman 
(2001), Pezier (2007), Meucci (2009) and the COP in Meucci (2006). 
A Appendix

In this appendix we present proofs, results and details that can be skipped at 
first reading. 



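A.1 The analytical solution

Using the explicit expression for the multivariate normal pdf

$$\ln f_{\mu,\Sigma}(x) = -\frac{N}{2} \ln(2\pi) - \frac{1}{2} \ln|\Sigma| - \frac{1}{2} (x - \mu)' \Sigma^{-1} (x - \mu) \tag{42}$$

we can compute the Kullback-Leibler divergence between normal distributions:

$$\begin{aligned}
D_{KL}(\tilde f_{\tilde\mu,\tilde\Sigma}, f_{\mu,\Sigma}) &\equiv \int \tilde f_{\tilde\mu,\tilde\Sigma}(x) \ln \frac{\tilde f_{\tilde\mu,\tilde\Sigma}(x)}{f_{\mu,\Sigma}(x)} \, dx \\
&= -\frac{N}{2} \ln(2\pi) - \frac{1}{2} \ln|\tilde\Sigma| - \frac{1}{2} \tilde E\{(X - \tilde\mu)' \tilde\Sigma^{-1} (X - \tilde\mu)\} \\
&\quad + \frac{N}{2} \ln(2\pi) + \frac{1}{2} \ln|\Sigma| + \frac{1}{2} \tilde E\{(X - \mu)' \Sigma^{-1} (X - \mu)\} \\
&= \frac{1}{2} \ln \frac{|\Sigma|}{|\tilde\Sigma|} - \frac{1}{2} \operatorname{tr}\left( \tilde E\{(X - \tilde\mu)(X - \tilde\mu)'\} \tilde\Sigma^{-1} \right) + \frac{1}{2} \operatorname{tr}\left( \tilde E\{(X - \mu)(X - \mu)'\} \Sigma^{-1} \right) \\
&= \frac{1}{2} \ln \frac{|\Sigma|}{|\tilde\Sigma|} - \frac{N}{2} + \frac{1}{2} \operatorname{tr}(\tilde\Sigma \Sigma^{-1}) + \frac{1}{2} (\tilde\mu - \mu)' \Sigma^{-1} (\tilde\mu - \mu).
\end{aligned} \tag{43}$$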

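Our purpose is to minimize the Kullback-Leibler divergence (43) under the
constraints (19). Using the following matrix identity

$$\operatorname{vec}(\Gamma)' \operatorname{vec}(A) \equiv \sum_{n,m} \Gamma_{nm} A_{nm} = \operatorname{tr}(\Gamma' A) \tag{44}$$

we write the Lagrangian as

$$\mathcal{L} \equiv \frac{1}{2} (\tilde\mu - \mu)' \Sigma^{-1} (\tilde\mu - \mu) + \frac{1}{2} \operatorname{tr}(\Sigma^{-1} \tilde\Sigma) - \frac{1}{2} \ln|\tilde\Sigma| - \lambda' (Q \tilde\mu - \tilde\mu_Q) - \frac{1}{2} \operatorname{tr}\left( \Gamma' (G \tilde\Sigma G' - \tilde\Sigma_G) \right). \tag{45}$$

The first order conditions for $\tilde\mu$ read

$$0 = \frac{\partial \mathcal{L}}{\partial \tilde\mu} = \Sigma^{-1} (\tilde\mu - \mu) - Q' \lambda, \tag{46}$$

or equivalently

$$\tilde\mu - \mu = \Sigma Q' \lambda. \tag{47}$$

Pre-multiplying both sides by Q and using the first constraint in (19), this implies

$$\lambda = (Q \Sigma Q')^{-1} (\tilde\mu_Q - Q \mu). \tag{48}$$

Substituting this in (47) we obtain

$$\tilde\mu = \mu + \Sigma Q' (Q \Sigma Q')^{-1} (\tilde\mu_Q - Q \mu). \tag{49}$$

To determine the first order conditions for $\tilde\Sigma$ we first use the identity in
Minka (2003)

$$d \ln|X| = \operatorname{tr}(X^{-1} dX) \tag{50}$$

and the symmetry of $\Gamma$ to express the differential of the Lagrangian with respect
to $\tilde\Sigma$ as follows:

$$d\mathcal{L} = \frac{1}{2} \operatorname{tr}(\Sigma^{-1} d\tilde\Sigma) - \frac{1}{2} \operatorname{tr}(\tilde\Sigma^{-1} d\tilde\Sigma) - \frac{1}{2} \operatorname{tr}(G' \Gamma G \, d\tilde\Sigma). \tag{51}$$

Using again (44) and setting (51) to zero we obtain:

$$\tilde\Sigma^{-1} = \Sigma^{-1} - G' \Gamma G. \tag{52}$$

Using the following matrix identity (A and D invertible, B and C conformable)

$$(A - B D^{-1} C)^{-1} = A^{-1} - A^{-1} B (C A^{-1} B - D)^{-1} C A^{-1}, \tag{53}$$

we can write (52) as

$$\tilde\Sigma = (\Sigma^{-1} - G' \Gamma G)^{-1} = \Sigma - \Sigma G' (G \Sigma G' - \Gamma^{-1})^{-1} G \Sigma. \tag{54}$$

Using the constraints

$$\tilde\Sigma_G \equiv G \tilde\Sigma G' = G \Sigma G' - G \Sigma G' (G \Sigma G' - \Gamma^{-1})^{-1} G \Sigma G' \tag{55}$$

or

$$(G \Sigma G' - \Gamma^{-1})^{-1} = (G \Sigma G')^{-1} - (G \Sigma G')^{-1} \tilde\Sigma_G (G \Sigma G')^{-1}. \tag{56}$$

Substituting this result back into (54) yields

$$\tilde\Sigma = \Sigma + \Sigma G' \left( (G \Sigma G')^{-1} \tilde\Sigma_G (G \Sigma G')^{-1} - (G \Sigma G')^{-1} \right) G \Sigma. \tag{57}$$
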
A.2 Views as linear constraints on the probabilities

Since this change is fully defined by the reference and the posterior distribution 
of the views V, to determine p we need only focus on this lower dimensional 
space instead of the whole market X. 






A.2.1 Partial information views

• Views a-la Black-Litterman

The generalized BL bullish/bearish view reads

$$\tilde m\{V_k\} \gtreqless \tilde m_k. \tag{58}$$

We can define $\tilde m_k$ exogenously. Alternatively, as in (7) we set

$$\tilde m_k \equiv \hat m_k + \kappa \hat\sigma_k, \tag{59}$$

where $\hat m_k$ is the sample mean of the k-th column of the panel $\mathcal{V}$ based on the
prior probability

$$\hat m_k \equiv \sum_{j=1}^J p_j \mathcal{V}_{j,k}, \tag{60}$$

and $\hat\sigma_k$ is the sample standard deviation of the k-th column of the panel $\mathcal{V}$ based
on the prior probability

$$\hat\sigma_k^2 \equiv \sum_{j=1}^J p_j (\mathcal{V}_{j,k} - \hat m_k)^2. \tag{61}$$

Alternatively, we set $\tilde m_k$ in (58) as the sample (^ + f )-tile of the k-th column
of the panel $\mathcal{V}$ based on the prior probability

$$\tilde m_k \equiv \mathcal{V}_{s(I),k}. \tag{62}$$

In this expression s is the sorting function of the k-th column of the panel $\mathcal{V}$,
i.e. denoting by $\mathcal{V}_{i:J,k}$ the i-th order statistics of the k-th column the function
s is defined as

$$\mathcal{V}_{s(i),k} \equiv \mathcal{V}_{i:J,k}, \quad i = 1, \ldots, J; \tag{63}$$

and the index / satisfies 

7 = argmax <j ^ < ( 1 + ^ J [> . (64) 



I 



. 4=1 



To express (58) as in (25) we first consider the case where $\tilde m\{V_k\}$ is the
expectation. Then its sample counterpart is the sample mean and (58) reads

$$\sum_{j=1}^J \tilde p_j \mathcal{V}_{j,k} \gtreqless \tilde m_k. \tag{65}$$

On the other hand, if $\tilde m\{V_k\}$ in (58) is the median, then the view reads

$$\sum_{j \in I_k} \tilde p_j \gtreqless \frac{1}{2}, \tag{66}$$

where $I_k$ denotes the indices of the scenarios in $\mathcal{V}_{\cdot,k}$ larger than $\tilde m_k$.
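
When a median view of the type (66) is the only view, the entropy minimization has a closed form: the probabilities inside the index set are rescaled proportionally to reach the required total mass, and those outside are rescaled to the complement. The sketch below illustrates this for a "median at least as large as a given level" view; the function names `set_mass_view` and `median_at_least` and the toy data are hypothetical.

```python
def set_mass_view(p, in_set, q):
    """Minimum relative entropy probabilities with total mass q on flagged scenarios."""
    mass = sum(pj for pj, flag in zip(p, in_set) if flag)
    if not 0.0 < mass < 1.0 or not 0.0 <= q <= 1.0:
        raise ValueError("prior mass must be strictly between 0 and 1, q in [0, 1]")
    return [pj * (q / mass if flag else (1.0 - q) / (1.0 - mass))
            for pj, flag in zip(p, in_set)]

def median_at_least(p, v, level):
    """View: the mass of scenarios larger than `level` is at least one half."""
    above = [vj > level for vj in v]
    mass_above = sum(pj for pj, flag in zip(p, above) if flag)
    if mass_above >= 0.5:
        return list(p)                     # the view is already satisfied
    return set_mass_view(p, above, 0.5)    # otherwise the constraint binds

# Hypothetical column of the views panel with equal prior probabilities
v = [1.0, 2.0, 3.0, 4.0, 5.0]
p = [0.2] * 5
q = median_at_least(p, v, level=3.5)
```

With additional views the rescaling is no longer available in closed form, and the constraint enters the general optimization as one more row of the linear system.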
• Relative ranking

The relative ordering view

$$m\{V_1\} \ge m\{V_2\} \ge \cdots \ge m\{V_K\}, \tag{67}$$

when the location parameter is the expectation, translates into the following set of linear constraints:

$$\sum_{j=1}^{J} \tilde p_j \left(V_{j,k} - V_{j,k+1}\right) \ge 0, \quad k = 1,\ldots,K-1. \tag{68}$$
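
A ranking view on the expectations therefore contributes one inequality row per adjacent pair of views. A minimal sketch, assuming the panel is stored as a list of scenario rows and using hypothetical helper names:

```python
def ranking_rows(V):
    """One row per adjacent pair (k, k+1) with entries V[j][k] - V[j][k+1]."""
    K = len(V[0])
    return [[row[k] - row[k + 1] for row in V] for k in range(K - 1)]


def ranking_view_satisfied(p, V, tol=1e-12):
    """True if sum_j p_j (V[j][k] - V[j][k+1]) >= 0 for every adjacent pair."""
    return all(
        sum(pj * a for pj, a in zip(p, row)) >= -tol
        for row in ranking_rows(V)
    )
```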

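• Views on volatility

A view on volatility reads

$$\sigma\{V_k\} \gtreqless \tilde\sigma_k. \tag{69}$$

First we consider the case where $\sigma\{V_k\}$ is the standard deviation. Then (69) can be expressed as in (25) as

$$\sum_{j=1}^{J} \tilde p_j V_{j,k}^2 \gtreqless \hat m_k^2 + \tilde\sigma_k^2, \tag{70}$$

where $\hat m_k$ is the sample mean of the $k$-th column of the panel $V$. The benchmark $\tilde\sigma_k$ can be set exogenously. Alternatively, we set

$$\tilde\sigma_k \equiv \kappa\,\hat\sigma_k, \tag{71}$$

where $\hat\sigma_k$ is the sample standard deviation of the $k$-th column of the panel $V$.

When $\sigma\{V_k\}$ in (69) is the range between the $\left(\frac12 - \gamma\right)$-tile and the $\left(\frac12 + \gamma\right)$-tile of the distribution of $V_k$ we proceed as follows. First, compute the sample $\left(\frac12 - \kappa\gamma\right)$-tile $\underline V_k$ of the $k$-th column of the panel $V$ as in (62) and similarly the sample $\left(\frac12 + \kappa\gamma\right)$-tile $\overline V_k$. Then the view reads

$$\sum_{j\in\underline I_k} \tilde p_j \gtreqless \frac12 - \gamma, \qquad \sum_{j\in\overline I_k} \tilde p_j \gtreqless \frac12 - \gamma, \tag{72}$$

where $\underline I_k$ denotes the scenarios in the $k$-th column of $V$ that are smaller than $\underline V_k$ and $\overline I_k$ denotes the scenarios that are larger than $\overline V_k$.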

• Views on correlations

To stress-test the correlations with a pre-defined matrix $\tilde C$ we impose

$$\sum_{j=1}^{J} \tilde p_j V_{j,k} V_{j,l} = \hat m_k \hat m_l + \hat\sigma_k \hat\sigma_l \tilde C_{k,l}, \tag{73}$$

where $\hat m_k$ is the sample mean and $\hat\sigma_k$ is the sample standard deviation of the $k$-th column of the panel $V$.
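
The correlation view is again a single equality row: the cross products on the left, a target cross moment on the right. The sketch below is illustrative only; `correlation_view` is a hypothetical helper, and the means and standard deviations are computed under the prior `p`, which is an assumption consistent with (60)-(61).

```python
import math


def correlation_view(p, V, k, l, c_kl):
    """Return (row, rhs) of the equality view (73) for columns k and l."""
    def mean(col):
        return sum(pj * row[col] for pj, row in zip(p, V))

    def std(col):
        m = mean(col)
        return math.sqrt(sum(pj * (row[col] - m) ** 2 for pj, row in zip(p, V)))

    row = [r[k] * r[l] for r in V]                     # coefficients V[j][k] * V[j][l]
    rhs = mean(k) * mean(l) + std(k) * std(l) * c_kl   # target cross moment
    return row, rhs
```

A useful check is that, when `c_kl` equals the prior sample correlation, the prior probabilities already satisfy the constraint exactly.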

• Views on tail codependence

First we extract the empirical copula from the panel $V$ as in Meucci (2006): we sort the columns of $V$ in ascending order; then we define a panel $U$ whose generic $(j,k)$-th entry is the normalized ranking of $V_{j,k}$ within the $k$-th column (for instance, if $V_{j,7}$ is the 423rd smallest simulation in column 7, then $U_{j,7} \equiv 423/J$). Each row of $U$ represents a simulation from the copula of $V$.

Stress-testing the tail codependence means

$$C_V(u) \gtreqless \tilde C, \tag{74}$$

where $\tilde C$ can be set exogenously. This translates into

$$\sum_{j\in I_u} \tilde p_j \gtreqless \tilde C, \tag{75}$$

where $I_u$ denotes the scenarios in $U$ that lie jointly below $u$. To calibrate $\tilde C$, it is convenient to set it as the sample counterpart of $\kappa\,C_V(u)$, for a reference copula $C_V$ computed as above.
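
The rank transformation and the joint-tail indicator can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, there are no ties within a column, and "jointly below $u$" is read as less than or equal in every coordinate.

```python
def empirical_copula(V):
    """Panel U of normalized ranks: U[j][k] = rank of V[j][k] in column k, over J."""
    J, K = len(V), len(V[0])
    U = [[0.0] * K for _ in range(J)]
    for k in range(K):
        order = sorted(range(J), key=lambda j: V[j][k])  # ascending sort of column k
        for rank, j in enumerate(order, start=1):
            U[j][k] = rank / J
    return U


def joint_tail_indicator(U, u):
    """1.0 for scenarios whose copula coordinates all lie at or below u, else 0.0."""
    return [
        1.0 if all(u_jk <= u_k for u_jk, u_k in zip(row, u)) else 0.0
        for row in U
    ]
```

The indicator row, weighted by the probabilities, gives the left-hand side of the tail codependence view.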



A.2.2 Full-information views

• Views on copula

If a full copula is specified, we draw a $J \times K$ panel of simulations $\tilde U$ from it. To do so, we can fit $U$ to a parametric copula $U_\theta$ that depends on a set of parameters $\theta$; then $\tilde U$ is obtained by drawing from the copula $U_{\tilde\theta}$, where $\tilde\theta$ is a perturbation of the estimated parameters $\hat\theta$.

Then $\tilde p$ is determined by matching all the cross moments

$$\sum_{j=1}^{J} \tilde p_j U_{j,k} U_{j,l} = \sum_{j=1}^{J} p_j \tilde U_{j,k} \tilde U_{j,l}, \quad k \ge l = 1,\ldots,K \tag{76}$$

$$\sum_{j=1}^{J} \tilde p_j U_{j,k} U_{j,l} U_{j,i} = \sum_{j=1}^{J} p_j \tilde U_{j,k} \tilde U_{j,l} \tilde U_{j,i}, \quad k \ge l \ge i = 1,\ldots,K \tag{77}$$

as well as all the marginal moments of the uniform distribution

$$\sum_{j=1}^{J} \tilde p_j U_{j,k} = \frac12 \tag{78}$$

$$\sum_{j=1}^{J} \tilde p_j U_{j,k}^2 = \frac13 \tag{79}$$

up to a given order.

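• Views on marginal distributions

If a full marginal distribution for the $k$-th view is specified, we draw a $J \times 1$ vector of simulations $\tilde V_{\cdot,k}$ from it. Then $\tilde p$ is determined by matching all the moments up to a given order:

$$\sum_{j=1}^{J} \tilde p_j V_{j,k} = \sum_{j=1}^{J} p_j \tilde V_{j,k} \tag{80}$$

$$\sum_{j=1}^{J} \tilde p_j \left(V_{j,k}\right)^2 = \sum_{j=1}^{J} p_j \left(\tilde V_{j,k}\right)^2 \tag{81}$$

$$\sum_{j=1}^{J} \tilde p_j \left(V_{j,k}\right)^3 = \sum_{j=1}^{J} p_j \left(\tilde V_{j,k}\right)^3 \tag{82}$$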



• Views on joint distribution 

If a full joint view distribution ([S]) is specified, we draw a J x K panel of 
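simulations $\tilde V$ from it. This can be done in one shot, or by pairing a desired copula with desired marginals as in Meucci (2006). Then $\tilde p$ is determined by matching all the cross moments up to a given order:

$$\sum_{j=1}^{J} \tilde p_j V_{j,k} = \sum_{j=1}^{J} p_j \tilde V_{j,k}, \quad k = 1,\ldots,K \tag{83}$$

$$\sum_{j=1}^{J} \tilde p_j V_{j,k} V_{j,l} = \sum_{j=1}^{J} p_j \tilde V_{j,k} \tilde V_{j,l}, \quad k \ge l = 1,\ldots,K \tag{84}$$

$$\sum_{j=1}^{J} \tilde p_j V_{j,k} V_{j,l} V_{j,i} = \sum_{j=1}^{J} p_j \tilde V_{j,k} \tilde V_{j,l} \tilde V_{j,i}, \quad k \ge l \ge i = 1,\ldots,K \tag{85}$$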



A.3 Numerical entropy minimization

The entropy minimization problem (27) reads explicitly

$$\tilde p \equiv \operatorname*{argmin}_{\substack{Fx \le f \\ Hx = h}} \left\{\sum_{j=1}^{J} x_j \left(\ln(x_j) - \ln(p_j)\right)\right\}, \tag{86}$$

where we have collected all the inequality constraints in the matrix-vector pair $(F, f)$, all the equality constraints in the matrix-vector pair $(H, h)$, and where we do not include the extra constraint

$$x \ge 0 \tag{87}$$

because it will be automatically satisfied. 
The Lagrangian for (86) reads

$$\mathcal L(x, \lambda, \nu) \equiv x'\left(\ln(x) - \ln(p)\right) + \lambda'(Fx - f) + \nu'(Hx - h). \tag{88}$$

The first order conditions for $x$ read

$$0 = \frac{\partial\mathcal L}{\partial x} = \ln(x) - \ln(p) + 1 + F'\lambda + H'\nu. \tag{89}$$

The solution is

$$x(\lambda, \nu) = e^{\ln(p) - 1 - F'\lambda - H'\nu}. \tag{90}$$

Notice that the solution is always positive, which justifies not considering (87). The Lagrange dual function is defined as

$$\mathcal G(\lambda, \nu) \equiv \mathcal L\left(x(\lambda, \nu), \lambda, \nu\right). \tag{91}$$

This function can be computed explicitly. The optimal Lagrange multipliers follow from the numerical maximization of the Lagrange dual function

$$(\lambda^*, \nu^*) \equiv \operatorname*{argmax}_{\lambda \ge 0,\ \nu} \left\{\mathcal G(\lambda, \nu)\right\}. \tag{92}$$

Notice that, whereas the Lagrangian should be minimized, the dual Lagrangian must be maximized. Also notice that both the gradient and the Hessian can be easily computed (the former from the envelope theorem), which improves the efficiency of the algorithm.
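
Carrying out the substitution explicitly (our own working from (88) and (90), offered as a sketch): at the stationary point $\ln(x) - \ln(p) = -1 - F'\lambda - H'\nu$, so the dual function and its derivatives are

$$\mathcal G(\lambda, \nu) = -\mathbf 1' x(\lambda, \nu) - \lambda' f - \nu' h,$$

$$\frac{\partial\mathcal G}{\partial\lambda} = F x(\lambda, \nu) - f, \qquad \frac{\partial\mathcal G}{\partial\nu} = H x(\lambda, \nu) - h,$$

$$\nabla^2 \mathcal G = -\begin{pmatrix} F \\ H \end{pmatrix} \operatorname{diag}\left(x(\lambda, \nu)\right) \begin{pmatrix} F \\ H \end{pmatrix}'.$$

Since $x(\lambda, \nu) > 0$, the Hessian is negative semidefinite, so the dual function is concave and Newton-type methods apply.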

Finally, the solution to the original problem (86) reads

$$\tilde p = x(\lambda^*, \nu^*). \tag{93}$$

The numerical optimization (92) acts on a very limited number of variables, equal to the number of views. It does not act directly on the very large number of variables of interest, namely the probabilities of the Monte Carlo scenarios: this feature guarantees the numerical feasibility of entropy optimization.
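
To make the dual approach concrete, here is a minimal pure-Python sketch restricted to equality views $Hx = h$ (inequality views, with multipliers $\lambda \ge 0$, would need a bound-constrained optimizer and are left out). It is not the author's published code: the function names, the damped Newton iteration and the tolerances are illustrative assumptions. The normalization of the probabilities is passed as one of the rows of `H`.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][k] * z[k] for k in range(r + 1, n))) / m[r][r]
    return z


def entropy_pooling(p, H, h, tol=1e-10, max_iter=200):
    """Posterior probabilities under equality views H x = h, via the dual problem."""
    J, M = len(p), len(h)

    def primal(nu):  # x(nu) = exp(ln p - 1 - H' nu), i.e. (90) without F
        return [p[j] * math.exp(-1.0 - sum(H[i][j] * nu[i] for i in range(M)))
                for j in range(J)]

    def dual(nu, x):  # dual function: -sum(x) - nu' h
        return -sum(x) - sum(nu[i] * h[i] for i in range(M))

    nu = [0.0] * M
    x = primal(nu)
    for _ in range(max_iter):
        grad = [sum(H[i][j] * x[j] for j in range(J)) - h[i] for i in range(M)]
        if max(abs(g) for g in grad) < tol:
            break
        hess = [[-sum(H[i][j] * H[k][j] * x[j] for j in range(J))
                 for k in range(M)] for i in range(M)]
        step = solve_linear(hess, grad)  # Newton update is nu - step
        t, g_old = 1.0, dual(nu, x)
        while True:  # halve the step until the dual does not decrease
            trial = [nu[i] - t * step[i] for i in range(M)]
            x_trial = primal(trial)
            if dual(trial, x_trial) >= g_old - 1e-14 or t < 1e-8:
                break
            t *= 0.5
        nu, x = trial, x_trial
    return x
```

For example, with five equally likely scenarios `v = [-2, -1, 0, 1, 2]`, calling `entropy_pooling([0.2] * 5, [[1.0] * 5, v], [1.0, 0.5])` searches over only two multipliers, one for the normalization and one for the view on the expectation, regardless of the number of scenarios.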
A.4 Confidence specification

We consider five increasingly complex cases. First, there is only one user with equal confidence in all his views. Second, there is only one user, but each view can potentially have a different confidence. Third, there are multiple users, where each user has equal confidence in their own views. Fourth, there are multiple users, but each view of each user can potentially have a different confidence. Fifth, we propose a general framework to accommodate all possible specifications.

A.4.1 One user, equal confidence in all views

This is the case considered in the pooling expression (15). The confidence $c$ can be interpreted as the subjective probability that the views, rather than the reference market model, are correct. Indeed, consider the mixture market

$$\tilde X_c \overset{d}{=} (1 - B)\,X + B\,\tilde X, \tag{94}$$

where $X$ is distributed according to the reference model (2) and $\tilde X$ according to the regime shift implied by the views. If $B$ is a 0-1 Bernoulli variable that decides between the two regimes with probabilities $1 - c$ and $c$ respectively, the pdf of $\tilde X_c$ is exactly (15).

Alternatively, we can represent the Bernoulli variable in (94) as follows:

$$\tilde X_c \overset{d}{=} I_{1-c}(U)\,X + I_c(U)\,\tilde X, \tag{95}$$

where $U$ is a uniform random variable, and $I_c$ and $I_{1-c}$ are indicator functions of non-overlapping intervals of size $c$ and $1 - c$.
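
On a panel of scenarios, the pdf of such a mixture is simply the confidence-weighted average of the two sets of scenario probabilities. A minimal sketch, where `pool_probabilities` is a hypothetical helper and both probability vectors are assumed to refer to the same scenarios:

```python
def pool_probabilities(p_prior, p_views, c):
    """Mixture (1 - c) * prior + c * full-confidence posterior, scenario by scenario."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("confidence c must lie in [0, 1]")
    return [(1.0 - c) * a + c * b for a, b in zip(p_prior, p_views)]
```

With `c = 0` the reference model is returned unchanged, and with `c = 1` the views are fully imposed.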

A.4.2 One user, views with different confidences

Consider the case where different views have different confidence levels. Each 
view is a statement such as (|6|- p2|) . 

We illustrate this situation with an example:

    index   view                        confidence
    1       $m\{V_1\} \ge m\{V_2\}$     10%
    2       $m\{V_2\} \ge m\{V_3\}$     30%                 (96)

One could model this situation in a way similar to (94): in 10% of the cases only the first view is satisfied and in 30% of the cases only the second view is satisfied. However, this is not correct. Instead, in 10% of the cases both views are satisfied and in 20% of the cases only the second view is satisfied.

In other words, we are assigning probabilities to the subsets of view combinations as follows:

    subset        confidence
    $\{1,2\}$     $c_{\{1,2\}} = 10\%$
    $\{1\}$       $c_{\{1\}} = 0\%$
    $\{2\}$       $c_{\{2\}} = 20\%$
    $\emptyset$   $c_\emptyset = 70\%$                      (97)






Then, the posterior reads

$$\tilde X \overset{d}{=} I_{c_\emptyset}(U)\,X_\emptyset + I_{c_{\{1\}}}(U)\,X_{\{1\}} + I_{c_{\{2\}}}(U)\,X_{\{2\}} + I_{c_{\{1,2\}}}(U)\,X_{\{1,2\}}. \tag{98}$$

In this expression $X_\emptyset$ is a random variable distributed according to the reference model (2); $X_{\{1\}}$ is an independent random variable, distributed according to the posterior with only the first view, whose pdf, which follows from (14), we denote by $f_{\{1\}}$; similarly for $X_{\{2\}}$; $X_{\{1,2\}}$ is an independent random variable, distributed according to the posterior from both views, whose pdf we denote by $f_{\{1,2\}}$; $U$ is a uniform random variable; and the $I_c$'s are indicator functions of non-overlapping intervals with size $c$ as in Table (97); in particular $I_{c_{\{1\}}}(U)$ is always zero. Then the pdf of (98) reads

$$f_{\tilde X} = c_\emptyset f_X + c_{\{1\}} f_{\{1\}} + c_{\{2\}} f_{\{2\}} + c_{\{1,2\}} f_{\{1,2\}}. \tag{99}$$

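In general, we start from a set of $L$ views with $L$ potentially different confidences

    index   view    confidence
    1       ...     $c_1$
    2       ...     $c_2$
    ⋮       ⋮       ⋮
    L       ...     $c_L$                                   (100)

From this, we obtain a probability $c_A$ for each subset $A$ of $\{1, 2, \ldots, L\}$ as follows:

$$\begin{array}{rcl}
\{1,2,\ldots,L\} & \mapsto & c_{\{1,2,\ldots,L\}} \equiv \min\left(c_l \mid l \in \{1,2,\ldots,L\}\right) \\
\{1,2,\ldots,L-1\} & \mapsto & c_{\{1,2,\ldots,L-1\}} \equiv \min\left(c_l \mid l \in \{1,2,\ldots,L-1\}\right) - c_{\{1,2,\ldots,L\}} \\
& \vdots & \\
\{2,\ldots,L\} & \mapsto & c_{\{2,\ldots,L\}} \equiv \min\left(c_l \mid l \in \{2,\ldots,L\}\right) - c_{\{1,2,\ldots,L\}} \\
\{1,2,\ldots,L-2\} & \mapsto & c_{\{1,2,\ldots,L-2\}} \equiv \min\left(c_l \mid l \in \{1,2,\ldots,L-2\}\right) - c_{\{1,2,\ldots,L-1\}} - c_{\{1,2,\ldots,L\}} \\
& \vdots & \\
\emptyset & \mapsto & c_\emptyset \equiv 1 - \sum_{A \ne \emptyset} c_A
\end{array} \tag{101}$$

The set of subsets is known as the "power set" and is denoted $2^{\{1,\ldots,L\}}$. Therefore, the views and their confidences are mapped into a probability on the power set of the views.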
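
One way to read this mapping, consistent with the two-view example in (96)-(97), is that each subset receives the minimum confidence among its views minus the probability already assigned to its supersets. Under that reading (an assumption on our part) only a nested chain of subsets gets positive probability: the $n$ most-trusted views together receive the gap between the $n$-th and the $(n+1)$-th largest confidence. A minimal sketch with a hypothetical helper name:

```python
def subset_probabilities(conf):
    """Map view confidences {view: c} to probabilities on subsets of views.

    Only subsets with positive probability (plus the empty set) are returned.
    """
    ranked = sorted(conf, key=conf.get, reverse=True)   # most trusted view first
    levels = [conf[view] for view in ranked] + [0.0]    # sorted confidences, then 0
    probs = {frozenset(): 1.0 - levels[0]}              # no view holds
    for n in range(1, len(ranked) + 1):
        mass = levels[n - 1] - levels[n]                # gap between consecutive levels
        if mass > 0.0:
            probs[frozenset(ranked[:n])] = mass
    return probs
```

For the example above, `subset_probabilities({1: 0.10, 2: 0.30})` assigns 10% to both views, 20% to the second view alone, 70% to the empty set, and omits the subset containing only the first view.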

The posterior is defined in distribution as follows

$$\tilde X \overset{d}{=} \sum_{A \in 2^{\{1,\ldots,L\}}} I_{c_A}(U)\,X_A, \tag{102}$$

where $U$ is a uniform random variable; the $I_c$'s are indicator functions of non-overlapping intervals with size $c_A$ as in (101); the $X_A$'s are independent random variables, distributed according to the posterior with only the views in the set $A$, whose pdf we denote by $f_A$.

The pdf of the posterior (102) then reads

$$f_{\tilde X} = \sum_{A \in 2^{\{1,\ldots,L\}}} c_A f_A. \tag{103}$$

Notice that in practice the vast majority of the potentially $2^L$ subsets will have null probability $c_A$ and therefore those terms will not appear in (102) or (103).

A.4.3 Multiple users, equal confidence levels in their views

This is the case considered in the pooling expression (16), which we report here

$$f_{\tilde X} = \sum_{s=0}^{S} c_s f_{X^{(s)}}. \tag{104}$$

A.4.4 Multiple users, different confidence levels in their views

More generally, consider $S$ users. The generic $s$-th user has $L_s$ views with potentially different relative confidences, modeled as in (103). On the other hand, each user has been given an overall confidence level as in (104). The pdf of the posterior follows from integrating the bottom-up approach (103) and the top-down approach (104) as follows:

$$f_{\tilde X} = \sum_{s=0}^{S} c_s \sum_{A_s \in 2^{\{1,\ldots,L_s\}}} c_{A_s} f_{A_s}. \tag{105}$$

We remark that in practice the vast majority of the potentially large number of terms $c_{A_s}$ in (105) is null. This model can also be embedded in the framework of a probability on the power set of the views, as in (102)-(103); see Appendix A.4.5.

A.4.5 General case

We can interpret the multi-user, multi-confidence framework as a set of $L \equiv L_1 + \cdots + L_S$ views with confidences defined as the product of the overall confidence in the user times the relative confidence of the user in his different views.

                index         view    conf.
    user 1:     $(1,1)$       ...     $c_{1,1}$
                $(1,2)$       ...     $c_{1,2}$
                ⋮             ⋮       ⋮
                $(1,L_1)$     ...     $c_{1,L_1}$
    ⋮
                index         view    conf.
    user S:     $(S,1)$       ...     $c_{S,1}$
                $(S,2)$       ...     $c_{S,2}$
                ⋮             ⋮       ⋮
                $(S,L_S)$     ...     $c_{S,L_S}$           (106)



Consider the power set

$$\mathcal A \equiv 2^{\{(1,1),\ldots,(S,L_S)\}}. \tag{107}$$

The sum in (105) can be expressed as

$$f_{\tilde X} = \sum_{A \in \mathcal A} c_A f_A, \tag{108}$$

where the coefficients $c_A$ are determined by the integration of the bottom-up approach (103) and the top-down approach (104): due to this integration only very few among all the possible elements $A \in \mathcal A$ have a non-null coefficient $c_A$. However, there are many choices of the $c_A$'s consistent with (106). According to any such choice, the posterior is expressed in distribution as

$$\tilde X \overset{d}{=} \sum_{A \in \mathcal A} I_{c_A}(U)\,X_A, \tag{109}$$

where the same notation as (102) applies, and the pdf reads

$$f_{\tilde X} = \sum_{A \in \mathcal A} c_A f_A. \tag{110}$$